Exploiting transitions and focussing on linguistic properties for ASR
Abstract
This paper describes three cross-language ASR experiments that use hidden Markov modelling. The first shows that consonant identification improves when vowel transitions are used. In particular, the consonants’ place of articulation is identified better, because the vowel transitions contain formant trajectories that depend on the consonant’s place of articulation. The second experiment uses only acoustic parameters belonging to the consonant itself (no vowel transitions) and compares two conditions: using the acoustic parameters directly as input to hidden Markov modelling, and performing acoustic-phonetic mapping before applying hidden Markov modelling. Acoustic-phonetic mapping is shown to improve consonant identification rates greatly. In the third experiment, the acoustic parameters from the vowel transitions are also mapped onto consonantal (not vocalic) features, as are the acoustic parameters belonging to the consonants. However, the additional use of vowel transitions does not lead to further improvement in consonant identification. This is probably due to undertraining of the vowel transitions in the Kohonen network.
Similar resources
Magnetic Properties and Phase Transitions in a Spin-1 Random Transverse Ising Model on Simple Cubic Lattice
Within the effective-field theory with correlations (EFT), a transverse random field spin-1 Ising model on the simple cubic (z=6) lattice is studied. The phase diagrams, the behavior of critical points, transverse magnetization, internal energy, and magnetic specific heat are obtained numerically and discussed for different values of p, the concentration of the random transverse field.
Speech Recognition using Acoustic Landmarks and Binary Phonetic Feature Classifiers
In spite of decades of research, Automatic Speech Recognition (ASR) is far from reaching the goal of performance close to Human Speech Recognition (HSR). One of the reasons for unsatisfactory performance of the state-of-the-art ASR systems, that are based largely on Hidden Markov Models (HMMs), is the inferior acoustic modeling of low level or phonetic level linguistic information in the speech...
Down-sampling speech representation in ASR
Features for automatic speech recognition (ASR) are typically sampled at about 100 Hz (10 ms analysis step). Recent experiments indicate that the most efficient components of the modulation spectrum of speech for ASR are up to about 16 Hz [1]. Consequently, RASTA processing attenuates modulation frequencies higher than 16 Hz and should in principle allow for a subsequent down-sampling of the feat...
Transculturation and Multilingual Lives: Writing between Languages and Cultures
This paper looks at the issues of transculturation as explored in auto and semi-autobiographical accounts of linguistic and cultural transitions. The paper also addresses a number of questions about the structure of these texts, the authors’ linguistic competences, as well as questions about the theoretical and conceptual tool which may help us to discuss the issues the writers are reflecting o...
Exploiting Linguistic Knowledge in Language Modeling of Czech Spontaneous Speech
In our paper, we present a method for incorporating available linguistic information into a statistical language model that is used in an ASR system for transcribing spontaneous speech. We employ the class-based language model paradigm and use the morphological tags as the basis for word-to-class mapping. Since the number of different tags is at least by one order of magnitude lower than the numb...